-
Notifications
You must be signed in to change notification settings - Fork 34
Update tropical_subseasonal set for eamxx output #982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update tropical_subseasonal set for eamxx output #982
Conversation
|
I may discovered a bug in xcdat that we need to fix for merging this PR. |
|
New issues when adding U_at_850hPa which has missing values. Error shown below: |
|
@jjbenedict hey Jim, in this PR, I'm working on have e3sm_diags to understand eamxx for tropical variability diagnostics. It works for precipitation and outgoing longwave so far, results can be found here. However the U850 output from eamxx has missing values (I think the EAM version output as extrapolation done at some point, thus no missing), and the wf_analysis function doesn't support data with missing values. See comment #982 (comment). At this point, I'm trying to come up with a way to treat the missing. Could you advise on this? Thanks! |
|
@chengzhuzhang Yes, the Fourier analyses require that all input fields must have no missing data. A simple and fast solution would be to linearly interpolate in U850 longitude to fill in the missing data "gaps". A more elegant solution might be to use something like a Poisson grid fill -- here's one version from NCL: https://www.ncl.ucar.edu/Document/Functions/Built-in/poisson_grid_fill.shtml I've used this NCL method previously, and it works well. Python must have something similar, though I haven't needed to use it yet. But to be honest, if the Fourier analysis is being applied to a smoothly varying field like U850 for daily averaged data, and we're plotting zonal wavenumbers -15 to +15, using the simple linear gap-fill would probably be sufficient. |
|
Hey Jim, thanks for your comment! Yes, agreed. I now realize if we only target 15N-15S, a simple interpolation method should just work fine..I will add a interpolation routine to work around missing data. Thanks! |
| # Interpolate missing values along longitude before spectral analysis | ||
| if np.any(np.isnan(x)): | ||
| logger.info( | ||
| "Interpolating missing values along longitude before spectral analysis" | ||
| ) | ||
| x = x.interpolate_na(dim="lon", method="linear", fill_value="extrapolate") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A simple linear interpolation to fill in missing for U850
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this linear interpolation unintentionally affect other variables, or do we want other variables with missing values to be handled too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point. I think IMERG precipitation data also has some missing values. Though this gap-filing method seems to be low risk to affect results. I will keep an eye on image tests for potential impact.
|
@jjbenedict Now works for 3 variables (results see here) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Integrate EAMxx-specific NE256 workflow into tropical_subseasonal diagnostics by adding example scripts, refactoring the driver to use xarray datasets, and extending variable derivations.
- Add a Bash script to generate climatology, diurnal cycle, and time-series inputs for EAMxx NE256.
- Introduce a Python driver to run core, diurnal, and tropical_subseasonal diagnostics for EAMxx output.
- Refactor
tropical_subseasonal_driverto accept xarray datasets and update the spectrum calculation interface. - Extend the derivations mapping to support the
U850alias for 850 hPa zonal wind.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| examples/e3sm_diags_for_eamxx/test_eamxx_ne256.sh | New Bash workflow for generating EAMxx NE256 climatology and time-series inputs |
| examples/e3sm_diags_for_eamxx/run_e3sm_diags_eamxx256_climo.py | Driver script to run EAMxx-specific diagnostics including tropical_subseasonal |
| e3sm_diags/driver/tropical_subseasonal_driver.py | Refactor to use dataset objects and update calculate_spectrum signature |
| e3sm_diags/derivations/derivations.py | Add a derivation mapping for U850 alias |
Comments suppressed due to low confidence (3)
examples/e3sm_diags_for_eamxx/test_eamxx_ne256.sh:17
- [nitpick] The variable name
drcis ambiguous. Consider renaming it to something more descriptive likenative_climo_dir.
drc=${drc_out}/native/climo
e3sm_diags/driver/tropical_subseasonal_driver.py:55
- The variable
seasonis not defined in this scope. You need to loop over seasons or referenceparameter.seasons.
parameter.test_name_yrs = test_data.get_name_yrs_attr(season)
e3sm_diags/driver/tropical_subseasonal_driver.py:58
- The variable
run_typeis undefined here. Consider passing it in or using a parameter attribute for this condition.
if run_type == "model_vs_model":
4644db3 to
4fb0a3f
Compare
tomvothecoder
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @chengzhuzhang, I had one question about the interpolation. Other than that, looks good to me. Thanks.
| parameter.test_end_yr, | ||
| ) | ||
| # Get test dataset | ||
| test_ds = test_data.get_time_series_dataset(variable, single_point=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand, this line replaces the I/O that was being performed in calculate_spectrum to get the variable. This looks much cleaner now
| parameter.ref_start_yr, | ||
| parameter.ref_end_yr, | ||
| ) | ||
| ref_ds = ref_data.get_time_series_dataset(variable, single_point=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this line too.
|
|
||
| # Extract variable data from dataset and subset to tropical latitudes | ||
| var = ds[variable].sel(lat=slice(-15, 15)) | ||
|
|
||
| # Get actual time range from the data | ||
| actual_start = var.time.dt.year.values[0] | ||
| actual_end = var.time.dt.year.values[-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calculate_spectrum looks a lot simpler than before!
| # Interpolate missing values along longitude before spectral analysis | ||
| if np.any(np.isnan(x)): | ||
| logger.info( | ||
| "Interpolating missing values along longitude before spectral analysis" | ||
| ) | ||
| x = x.interpolate_na(dim="lon", method="linear", fill_value="extrapolate") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this linear interpolation unintentionally affect other variables, or do we want other variables with missing values to be handled too?
|
@tomvothecoder thanks for the review! |
Description
Checklist
If applicable: